Skip to content

Add status filtering to the runs endpoint#481

Open
Aryan95614 wants to merge 2 commits into
Netflix:masterfrom
Aryan95614:w2-status-filter
Open

Add status filtering to the runs endpoint#481
Aryan95614 wants to merge 2 commits into
Netflix:masterfrom
Aryan95614:w2-status-filter

Conversation

@Aryan95614

Copy link
Copy Markdown

Adds a status query parameter to GET /flows/{flow_id}/runs so a run listing can be filtered to running / completed / failed without pulling every row and filtering client-side. The parameter is repeatable (?status=running&status=failed) to match more than one status; unknown values return 400.

Status is derived from the run's end-attempt metadata using the same definition ui_backend_service already serves, so the two agree. It is applied as a WHERE predicate over the existing query path, and the joins are only enabled when a status filter is present, so the plain listing is unchanged. Because the predicate lives in the WHERE, it composes with limit/offset instead of replacing them — a filtered page is the matching rows taken after filtering.

Tested against real Postgres: status classification (completed wins over a live heartbeat, explicit attempt_ok=false, stale heartbeat, running), repeated-param OR, the 400 path, and that the filter composes with a limited page.

Known test-coverage gap: the failed-then-retrying -> running branch isn't covered yet (needs deterministic metadata timestamps in the test helper).

GET /flows/{flow_id}/runs now takes a status query parameter
(running/completed/failed), repeatable to match more than one.

Status is derived from the run's end-attempt metadata, reusing the
same definition the UI backend serves so the two agree. It is applied
as a WHERE predicate over the existing query path, so it composes with
limit/offset rather than replacing it; the join is only enabled when a
status filter is present, leaving the plain listing untouched. Unknown
status values are rejected with a 400.
Cover the first branch of the run-status CASE: a run whose 'end' step failed
(attempt_ok=false) but has a newer 'attempt' record is being retried and must read
as running, not failed. The branch turns on a strict ts_epoch comparison between the
two metadata rows, so the test stamps their timestamps explicitly via update_row
rather than relying on insert order. Completes coverage of all five status branches.

@saikonen saikonen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good so far. Let's wait for the run pagination PR to go in so this can be modified to support cursor based pagination as well.

# joins below pull the 'end' step's attempt metadata that the status depends on.
# Only used when joins are enabled (i.e. when filtering by status), so the plain
# get_all_runs path stays a simple, join-free query.
joins = [

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

battle-tested query logic from the ui_backend side, and this is gated behind the status query param, so from a performance perspective it should be fine to add to the metadata-service as well.

The only concern is with status=failed&status=running which is in the case of relying on heartbeats, not entirely deterministic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants